%/* ----------------------------------------------------------- */
%/*                                                             */
%/*                          ___                                */
%/*                       |_| | |_/   SPEECH                    */
%/*                       | | | | \   RECOGNITION               */
%/*                       =========   SOFTWARE                  */ 
%/*                                                             */
%/*                                                             */
%/* ----------------------------------------------------------- */
%/* developed at:                                               */
%/*                                                             */
%/*      Speech Vision and Robotics group                       */
%/*      Cambridge University Engineering Department            */
%/*      http://svr-www.eng.cam.ac.uk/                          */
%/*                                                             */
%/* ----------------------------------------------------------- */
%/*         Copyright:                                          */
%/*                                                             */
%/*              2002  Cambridge University                     */
%/*                    Engineering Department                   */
%/*                                                             */
%/*   Use of this software is governed by a License Agreement   */
%/*    ** See the file License for the Conditions of Use  **    */
%/*    **     This banner notice must not be removed      **    */
%/*                                                             */
%

% HTKBook - this chapter written by Dan Povey 01/11/02
%

\newpage
\mysect{HMMIRest}{HMMIRest}

\mysubsect{Function}{HMMIRest-Function}

\index{hmmirest@\htool{HMMIRest}|(}

\htool{HMMIRest} is used to perform a single iteration of Extended Baum-Welch
for discriminative training of HMMs.  Maximum Mutual Information (MMI),
Minimum Phone Error (MPE) and  Minimum Word Error (MWE) are supported.
For further details of the theory and configuration settings for
discriminative training see section~\ref{s:discriminative}. A tutorial example
for discriminative training is given in section~\ref{s:egdiscriminative}.

Discriminative training needs to be initialised with trained models; these
will generally be trained using MLE.  Also essential are phone-marked lattices.
Two sets of lattices are needed: a lattice for the correct transcription of
each training file, and a lattice derived from the recognition of each
training file.  These are called the ``numerator'' and ``denominator''
lattices respectively, a reference to their respective positions in the
MMI objective function, which can be expressed as follows:

\newcommand{\num}{{{\tt num}}}
\newcommand{\den}{{{\tt den}}}
\newcommand{\MMI}{{\tt{mmi}}}
\newcommand{\calO}{ {\bm O} }
\newcommand{\calM}{ {\cal M} }

\begin{equation}
    {\cal F}_\MMI(\lambda) = \sum_{r=1}^R\log \frac { P^\kappa(\calO_r | \calM^\num_r) } { P^\kappa(\calO_r | \calM^\den_r) },
  \label{eq:mmimodels}
\end{equation}
where $\lambda$ is the HMM parameters,  $\kappa$ is a scale on the likelihoods, $\calO_r$ is the
speech data for the $r$'th training file and $\calM^\num_r$ and $\calM^\den_r$ are the numerator
and denominator models.  

The MPE/MWE criteria are examples of minimum Bayes' risk training. MPE can be expressed as
\begin{eqnarray}
{\cal F}_{\tt mpe}(\lambda) = \sum_{r=1}^R\sum_{\cal H}
\left(\frac{P^\kappa(\bm{O}^r|{\cal M}_{\cal H})}
{P^\kappa(\bm{O}^r|{\cal M}^{\tt den}_r)}\right)
{\cal L}({\cal H},{\cal H}^r_{\tt ref})
\end{eqnarray}
where ${\cal M}_{\cal H}$ is the acoustic model for hypothesis
${\cal H}$. ${\cal L}({\cal H},{\cal H}^r_{\tt ref})$ is a loss function
measured at the phone-level. Rather than minimising the phone, or word, error,
the normalised average phone accuracy 
is maximised. This may be expressed as
\begin{eqnarray}
{\hat{\lambda}} = \arg\max_{\lambda}\left\{
1 - \frac{1}{\sum_{r=1}^RQ^r}{\cal F}_{\tt mpe}(\lambda)
\right\}
\end{eqnarray}
where $Q^r$ is the number of phones in the transcription for sequence $r$.
In practice approximate forms of the MMI and normalised average phone accuracy
criteria are optimised. 

The numerator and denominator lattices need to contain phone alignment information
(the start and end times of each phone) and language model likelihoods.
Phone alignment information can be obtained by using \htool{HDecode.mod} with
the letter \texttt{d} included in the lattice format options (specified to \htool{HDecode.mod} by the
\texttt{-q} command line option).  It is important that the language model
applied to the lattice not be too informative. In addition the
pruning beam used to create the lattice should  be large enough that a
reasonable set of confusable hypotheses are represented in the lattice. For
more details of this process see the tutorial
section~\ref{s:egdiscriminative}. 

Having created these lattices using an initial set of models, the
tool \htool{HMMIRest} can be used to re-estimate the HMMs for 
typically 4-8 iterations, using the same set of lattices.   

\htool{HMMIRest} supports multiple mixture Gaussians, tied-mixture
HMMs, multiple data streams, parameter tying within and between models
(but not if means and variances are tied independently), and diagonal
covariance matrices only.


%\htool{HMMIRest} supports variance flooring in the same way as \htool{HERest}.

%It also supports \textit{Single pass retraining}, useful when the parameterisation of
%the front-end (e.g. from MFCC to PLP coefficients) is to be modified.


Like \htool{HERest}, \htool{HMMIRest} includes features to allow parallel
operation where a network of processors is available. When the training set
is large, it can be split into separate chunks that are processed in
parallel on multiple machines/processors, speeding up the training process.
\htool{HMMIRest} can operate in two modes:
\begin{enumerate}
\item Single-pass operation: the data files on the command line are the
  speech files to be used for re-estimation of the HMM set; the re-estimated
  HMM is written by the program to the directory specified by the {\texttt{-M}}
   option.
\item Two-pass operation with data split into \texttt{P} blocks:
   \begin{enumerate}  
     \item First pass: multiple jobs are started with the options \texttt{-p 1}, 
       \texttt{-p 2} $\ldots$ \texttt{-p P} are given; the different subsets of training
     files should be given on each command line or in a file specified by the \texttt{-S}
     option.  Accumulator files \texttt{<dir>/HDR<p>.acc.1} to  \texttt{<dir>/HDR<p>.acc.n} 
      are written, where \texttt{<dir>} is
    the directory specified by the $\texttt{-M}$  and \texttt{<p>} is the number specified
     by the \texttt{-p} option.   The number of accumulates stored depends on
     the criterion and form of prior.
     \item Second pass: \htool{HMMIRest} is called with \texttt{-p 0}, and the
      command-line arguments are the accumulator files written by the first pass.
     The re-estimated HMM set is written to the directory specified by the $\texttt{-M}$ option. 
   \end{enumerate}
\end{enumerate}
For more details of options of this form with \htool{HMMIRest} see section~\ref{s:hmmiresttrain}

If there are a large number of training files, the directories specified for
the numerator and denominator lattice can contain subdirectories containing
the actual lattices.  The name of the subdirectory required can be extracted
from the filenames by adding configuration variables for instance as follows:
\begin{verbatim}
LATMASKNUM = */%%%?????.???
LATMASKDEN = */%%%?????.???
\end{verbatim}
which would convert a filename \texttt{foo/bar12345.lat} to \texttt{<dir>/bar/bar12345.lat},
where \texttt{<dir>} would be the directory specified by the \texttt{-q} or \texttt{-r} option.
The \texttt{\%}'s are converted to the directory name and other characters are discarded.

\mysubsect{Use}{HMMIRest-Use}
 \label{sec:hmmirestuse}

\htool{HMMIRest} is invoked via the command line
\begin{verbatim}
   HMMIRest [options] hmmList trainFile ...
\end{verbatim}
or alternatively if the training is in parallel mode, the second pass is invoked as
\begin{verbatim}
   HMMIRest [options] hmmList accFile ...
\end{verbatim}

The full list of the options accepted by \htool{HMMIRest} is as follows. Some
of the options are marked (\texttt{hist}) to indicate that they were mainly
used for debugging and development experiments, but maintained for backward compatibility.

\begin{optlist}

  \ttitem{-a} Use an input transform to obtain alignments for updating
      models or transforms (default off).

  \ttitem{-d dir} 
      Normally \htool{HMMIRest} looks for HMM definitions
       (not already loaded via MMF files) 
      in the current directory.  This option tells \htool{HMMIRest} to look in
      the directory {\tt dir} to find them.

  \ttitem{-g} (\texttt{hist}) Maximum Likelihood (ML) mode-- Maximum Likelihood lattice-based
   estimation using only the numerator (correct-sentence) lattices.

  \ttitem{-h mask} Set the mask for determining which transform names are 
	to be used for the output transforms.

  \ttitem{-l} (\texttt{hist}) Maximum number of sentences to use (useful only for troubleshooting)

  \ttitem{-o ext} (\texttt{hist}) This causes the file name extensions of the
      original models (if any) to be replaced by {\tt ext}.

  \ttitem{-p N}  Set parallel mode.  1, 2 $\ldots$ N for a forward-backward
     pass on data split into N blocks.  0 for re-estimating using accumulator
    files that this will write.  Default (-1) is single-pass operation.

  \ttitem{-q dir}   Directory to look for numerator lattice files.  These files
      will have a filename the same as the speech file, but with extension
      \texttt{lat} (or as given by \texttt{-Q} option).

  \ttitem{-qp s}   Path pattern for the extended path to look for numerator lattice files. 
      The matched string will be spliced to the end of the directory string specified by
      option `-q' for a deeper path.

  \ttitem{-r dir}  Directory to look for denominator lattice files.

  \ttitem{-rp s}   Path pattern for the extended path to look for denominator lattice files.
     The matched string will be spliced to the end of the directory string specified by
      option `-r' for a deeper path.

  \ttitem{-s file}  File to write HMM statistics.

  \ttitem{-twodatafiles}  Expect two of each data file for single pass retraining.  
    This works as for HERest; command line contains alternating alignment and update
    files.

  \ttitem{-u flags} By default, \htool{HMMIRest} updates all of the HMM parameters,
      that is, means, variances, mixture weights and 
      transition probabilities. This 
      option causes just the parameters indicated by the {\tt flags}
      argument to be updated, this argument is a string containing one
      or more of the letters {\tt m} (mean), {\tt v} (variance) ,
      {\tt t} (transition) and {\tt w} (mixture weight).  The 
      presence of a letter enables
      the updating of the corresponding parameter set.

  \ttitem{-umle flags} (\texttt{hist}) A format as in \texttt{-u}; directs that the specified parameters
     be updated using the ML criterion (for MMI operation only, not MPE/MWE).

  \ttitem{-w floor} (\texttt{hist}) Set the mixture weight floor as a multiple
    of \texttt{MINMIX=1.0e-5}.  Default is 2.

  \ttitem{-x ext}  By default, \htool{HMMIRest} expects a HMM definition for 
      the label X to be stored in a file called {\tt X}.  This
      option causes \htool{HMMIRest} to look for the HMM definition in the
      file {\tt X.ext}.
\stdoptB
\stdoptE
\stdoptF
\stdoptH
\ttitem{-Hprior mmf}  Load HMM macro file {\tt mmf}, to be used as prior model in adaptation.  Use with configuration variables {\tt PRIORTAU} etc.  
\stdoptI
\stdoptJ
\stdoptM

 \ttitem{-Q ext} Set the lattice file extension to {\tt ext}
\end{optlist}

\stdopts{HMMIRest}

\mysubsect{Tracing}{HMMIRest-Tracing}


The command-line trace option can only be set to 0 (trace off)
or 1 (trace on), which is the default.  Tracing behaviour can
be altered by the {\tt TRACE} configuration variables in the modules \htool{HArc}
and \htool{HFBLat}.

\index{hmmirest@\htool{HMMIRest}|)}


%%% Local Variables: 
%%% mode: latex
%%% TeX-master: "../htkbook"
%%% End: 
